A Lexicon for Old Occitan Medico-Botanical Terminology in Lemon
Abstract
The article presents the adaptation of the lemon model (a model for lexica as RDF data) for a multilingual and multi-alphabetical lexicon of Old Occitan medico-botanical terminology. The lexicon is the core component of an ontology-based information system that will be constructed and implemented within the DFG-funded project "Dictionnaire de Termes Médico-botaniques de l’Ancien Occitan" (DiTMAO). The lemmatization difficulties raised by the particularities of the corpus (terms in Latin, Hebrew and Arabic script, with corresponding terms in other ancient languages, mostly Hebrew and Arabic) can be solved by extending the basic properties of lemon and introducing domain-specific vocabulary.
Similar resources
Pos-tagging different varieties of Occitan with single-dialect resources
In this study, we tackle the question of pos-tagging written Occitan, a lesser-resourced language with multiple dialects each containing several varieties. For pos-tagging, we use a supervised machine learning approach, requiring annotated training and evaluation corpora and optionally a lexicon, all of which were prepared as part of the study. Although we evaluate two dialects of Occitan, Leng...
Alpine ethnobotany in Italy: traditional knowledge of gastronomic and medicinal plants among the Occitans of the upper Varaita valley, Piedmont
A gastronomic and medical ethnobotanical study was conducted among the Occitan communities living in Blins/Bellino and Chianale, in the upper Val Varaita, in the Piedmontese Alps, North-Western Italy, and the traditional uses of 88 botanical taxa were recorded. Comparisons with, and analysis of, other ethnobotanical studies previously carried out in other Piedmontese and surrounding areas show th...
Building an old Occitan corpus via cross-Language transfer
This paper describes the implementation of a resource-light approach, cross-language transfer, to build and annotate a historical corpus for Old Occitan. Our approach transfers morpho-syntactic and syntactic annotation from resource-rich source languages, Old French and Catalan, to a genetically related target language, Old Occitan. The present corpus consists of three sub-corpora in XML format...
The VOLEM Project : a Framework for the Construction of Advanced Multilingual Lexicons
We report in this short document the results of a Regional European project carried out on Spanish, Catalan, Occitan and French whose aim is to design a lexical knowledge base where syntactic and semantic descriptions have been normalized and are treated in a uniform way cross-linguistically. Besides the scientific aspects, one of the aims is to make less developed languages such as Occitan or ...
BaTelÒc: A Text Base for the Occitan Language
Language Documentation, as defined by Himmelmann (2006), aims at compiling and preserving linguistic data for studies in linguistics, literature, history, ethnology, sociology. This initiative is vital for endangered languages such as Occitan, a Romance language spoken in southern France and in several valleys of Spain and Italy. The documentation of a language concerns all its modalities, cove...